A Mixed-effects Model for Incomplete Data from Labeling-based Quantitative Proteomics Experiments
Authors
Abstract
In mass spectrometry (MS)-based quantitative proteomics research, the emerging iTRAQ (isobaric tag for relative and absolute quantitation) and TMT (tandem mass tags) techniques have been widely adopted for high-throughput protein profiling. In a typical iTRAQ/TMT proteomics study, samples are grouped into batches and each batch is processed by one multiplex experiment, in which the abundances of thousands of proteins/peptides in a batch of samples can be measured simultaneously. The multiplex labeling technique greatly enhances the throughput of protein quantification. However, the technical variation across different iTRAQ/TMT multiplex experiments is often large due to the dynamic nature of MS instruments. This leads to strong batch effects in iTRAQ/TMT data. Moreover, iTRAQ/TMT data often contain substantial batch-level non-ignorable missingness. Specifically, the abundance measures of a given protein/peptide are often either observed or missing altogether in all the samples from the same batch, with the missing probability depending on the combined batch-level abundances. We term this missing-data mechanism the Batch-level Abundance-Dependent Missing-data Mechanism (BADMM). We introduce a new method, mixEMM, for analyzing iTRAQ/TMT data with batch effects and batch-level non-ignorable missingness. The mixEMM method employs a linear mixed-effects model and explicitly models the batch effects and the BADMM in the likelihood function. In simulation studies, we showed that, compared with existing approaches that use relative abundances and ignore the missing batches under the missing-completely-at-random assumption, the mixEMM method achieves more accurate parameter estimation and inference. We applied the method to an iTRAQ proteomics data set from a breast cancer study and identified phosphopeptides differentially expressed between breast cancer subtypes. The method can be applied to general clustered data with cluster-level non-ignorable missing-data mechanisms.
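The batch-level missingness described in the abstract can be illustrated with a small simulation. This is only a sketch under stated assumptions, not the mixEMM method itself: the abstract does not give the functional form of the missing probability, so the logistic link, the parameter names (`gamma0`, `gamma1`), and the function `simulate_badmm` are all hypothetical choices made for illustration. The sketch shows how dropping whole batches with probability tied to the batch mean abundance biases the mean of the observed data upward.

```python
import numpy as np

def simulate_badmm(n_batches=2000, batch_size=4, mu=0.0, sigma_batch=1.0,
                   sigma_noise=0.5, gamma0=0.0, gamma1=1.5, seed=0):
    """Simulate log-abundances of one protein with batch-level missingness.

    All parameter names and the logistic form are illustrative assumptions.
    """
    rng = np.random.default_rng(seed)
    # Random batch effect shared by every sample in the same batch.
    batch_effect = rng.normal(0.0, sigma_batch, size=(n_batches, 1))
    noise = rng.normal(0.0, sigma_noise, size=(n_batches, batch_size))
    y = mu + batch_effect + noise
    # Assumed mechanism: the lower the batch mean abundance,
    # the more likely the whole batch is missing.
    batch_mean = y.mean(axis=1)
    p_missing = 1.0 / (1.0 + np.exp(gamma0 + gamma1 * batch_mean))
    missing = rng.random(n_batches) < p_missing  # one flag per batch
    return y, missing

y, missing = simulate_badmm()
true_mean = y.mean()                 # mean over all batches (unobservable in practice)
observed_mean = y[~missing].mean()   # mean over observed batches only
print(f"fraction of batches missing: {missing.mean():.2f}")
print(f"mean of all data: {true_mean:.3f}, mean of observed data: {observed_mean:.3f}")
```

Under these assumptions the observed-data mean exceeds the full-data mean, which is why an analysis that treats missing batches as missing completely at random can give biased estimates.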
Related articles
Marginal Analysis of A Population-Based Genetic Association Study of Quantitative Traits with Incomplete Longitudinal Data
A common study to investigate gene-environment interaction is designed to be longitudinal and population-based. Data arising from longitudinal association studies often contain missing responses. Naive analysis without taking missingness into account may produce invalid inference, especially when the missing data mechanism depends on the response process. To address this issue in the ana...
Normalization and Statistical Analysis of Multiplexed Bead-based Immunoassay Data Using Mixed-effects Modeling
Multiplexed bead-based flow cytometric immunoassays are a powerful experimental tool for investigating cellular communication networks, yet their widespread adoption is limited in part by challenges in robust quantitative analysis of the measurements. Here we report our application of mixed-effects modeling for the normalization and statistical analysis of bead-based immunoassay data. Our data ...
A label-free quantification method by MS/MS TIC compared to SILAC and spectral counting in a proteomics screen.
In order to assess the biological function of proteins and their modifications for understanding signaling mechanisms within cells as well as specific biomarkers to disease, it is important that quantitative information be obtained under different experimental conditions. Stable isotope labeling is a powerful method for accurately determining changes in the levels of proteins and PTMs; however,...
Automated Analysis of Quantitative Image Data Using Isomorphic Functional Mixed Models, with Application to Proteomics Data
Image data are increasingly encountered and are of growing importance in many areas of science. Much of these data are quantitative image data, which are characterized by intensities that represent some measurement of interest in the scanned images. The data typically consist of multiple images on the same domain and the goal of the research is to combine the quantitative information across ima...
Designing a Talent Management Model for School Principals: A Mixed Approach
The aim of this study was to present a model of talent management in high schools of Tehran. The research was applied and developmental in terms of purpose, and was mixed in terms of data collection. In the qualitative part, the deductive content analysis method was used and in the quantitative part, the survey method was used, specifically based on the structural equation model (SEM). The stat...